Arm backend: Make the logic for temp_allocation_pool realistic #20064

Open
gggekov wants to merge 1 commit into pytorch:main from gggekov:temp_allocation_pool

Conversation

@gggekov
Collaborator

@gggekov gggekov commented Jun 5, 2026

  • Move the setting of ETHOSU_MODEL and ETHOSU_ARENA from corstone_utils.cmake to the top-level app CMake so that the linker script is generated correctly. As a result, the .bss.tensor section, used for the temp_allocation_pool, goes into SRAM instead of DDR for Shared_Sram.
  • Use 2 MB of temp_allocation_pool for Corstone-300 and 4 MB - 64 KB of temp_allocation_pool for Corstone-320.
  • Remove the specify_ethosu_scratch option.
  • Change the default memory mode for the Ethos-U85 to Dedicated_Sram to match how we build the arm_executor_runner binary.

cc @digantdesai @freddan80 @per @zingo @oscarandersson8218 @mansnils @Sebastian-Larsson @robell @rascani

Copilot AI review requested due to automatic review settings June 5, 2026 09:59
@pytorch-bot

pytorch-bot Bot commented Jun 5, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/20064

Note: Links to docs will display an error until the docs builds have been completed.

❌ 4 New Failures, 1 Pending, 3 Unrelated Failures

As of commit d11044e with merge base c4e3db0 (image):

NEW FAILURES - The following jobs have failed:

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla Bot added the CLA Signed label Jun 5, 2026
@github-actions github-actions Bot added the ciflow/trunk and module: arm labels Jun 5, 2026
@gggekov gggekov added the partner: arm label and removed the module: arm label Jun 5, 2026
@github-actions

github-actions Bot commented Jun 5, 2026

This PR needs a release notes: label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

Contributor

Copilot AI left a comment

Pull request overview

This PR updates the Arm bare-metal Ethos-U runner configuration to make temp_allocation_pool sizing and placement align with realistic Corstone SRAM constraints, and simplifies the flow by removing the dynamic “derive scratch size from PTE” option.

Changes:

  • Remove the --specify_ethosu_scratch option from scripts/tests and stop deriving scratch size from PTEs.
  • Set more realistic default ET_ARM_BAREMETAL_SCRATCH_TEMP_ALLOCATOR_POOL_SIZE values based on SYSTEM_CONFIG + MEMORY_MODE.
  • Move ETHOSU_MODEL / ETHOSU_ARENA handling to the top-level runner CMake so linker script preprocessing gets the correct values.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| examples/arm/run.sh | Removes --specify_ethosu_scratch support and adds extra configure logging. |
| examples/arm/executor_runner/CMakeLists.txt | Adjusts default scratch pool sizing and pre-processes linker scripts with ETHOSU_MODEL/ARENA based on memory mode. |
| backends/arm/test/test_model.py | Removes the PTE-derived scratch sizing option and associated build logic. |
| backends/arm/test/common.py | Updates the default U85 memory mode and extra flags used by tests. |
| backends/arm/scripts/get_ethosu_scratch_from_pte.py | Improves bundled-program detection and adjusts output formatting. |
| backends/arm/scripts/corstone_utils.cmake | Removes ETHOSU_MODEL/ARENA compile definitions from the Corstone helper target. |


Comment thread examples/arm/run.sh Outdated
@@ -82,7 +81,6 @@ function help() {
echo " --config=<FILEPATH> Ethos-U: System configuration file that specifies system configurations (vela.ini)"
echo " --memory_mode=<MODE> Ethos-U: Memory mode to select from the Vela configuration file (see vela.ini), e.g. Shared_Sram/Sram_Only. Default: 'Shared_Sram' for Ethos-U55 targets, 'Sram_Only' for Ethos-U85 targets"
Comment on lines +301 to +303
# On silicon, for Sram_Only, the model is assumed to be in the SRAM but that
# limits the number of models we test for Sram_Only. For testing coverage &
# consistency, we place the model in the external memory for Sram_Only
Collaborator Author

Fixed

Comment on lines +41 to +44
# If no mention of BP08 (specifier for bundled program), return pte
if len(pte_data) < 8 or pte_data[4:8] != b"BP08":
    return pte_data
# bundled program
Collaborator Author

No longer changing that as I am not using get_ethosu_scratch_from_pte.py
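
For readers following the thread, the check in the quoted hunk (which the author says is no longer part of this PR) can be sketched on its own. This is a minimal illustration, assuming only what the hunk shows: a bundled program carries the 4-byte identifier `BP08` at byte offsets 4 to 8, and anything shorter than 8 bytes or without that identifier is treated as a plain PTE. The helper name `is_bundled_program` is hypothetical and is not taken from get_ethosu_scratch_from_pte.py.

```python
# Identifier quoted in the review hunk; bytes 4..8 of the file are compared against it.
BUNDLED_PROGRAM_ID = b"BP08"


def is_bundled_program(data: bytes) -> bool:
    """Return True if `data` carries the bundled-program identifier.

    Hypothetical helper: mirrors the condition in the quoted hunk,
    `len(pte_data) < 8 or pte_data[4:8] != b"BP08"`, with the sense inverted.
    """
    return len(data) >= 8 and data[4:8] == BUNDLED_PROGRAM_ID


if __name__ == "__main__":
    # Synthetic buffers, not real PTE or bundled-program files.
    print(is_bundled_program(b"\x00" * 4 + b"BP08" + b"payload"))
    print(is_bundled_program(b"\x00" * 4 + b"XXXX" + b"payload"))
```

The length guard matters because slicing a short `bytes` object does not raise; it silently returns fewer than four bytes, so the explicit `len(data) >= 8` keeps the intent visible.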

- Move the setting of ETHOSU_MODEL & ETHOSU_ARENA from
corstone_utils.cmake to the top-level app CMake so that
the linker script is correctly generated. As a result,
the .bss.tensor section, used for the temp_allocation_pool,
goes into the SRAM instead of the DDR for Shared_Sram.
- Use 2MB of temp_allocation_pool for Corstone-300
and 4MB-64KB of temp_allocation_pool for Corstone-320.
- Remove the specify_ethosu_scratch option
- Change default memory mode for the Ethos-U85 to
Dedicated_Sram in order to match how we build the
arm_executor_runner binary

Signed-off-by: George Gekov <george.gekov@arm.com>
Change-Id: I16ecd991d722b665f0faf4b0ec998427a381fed8
@github-actions github-actions Bot added the module: arm label Jun 8, 2026
Copilot AI review requested due to automatic review settings June 8, 2026 16:32
@gggekov gggekov force-pushed the temp_allocation_pool branch from 90a3b8b to d11044e Compare June 8, 2026 16:32
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

Comment on lines +181 to +202
if(SYSTEM_CONFIG MATCHES "U55")
  # The Corstone-300 has 2MB of SRAM; provide the Ethos-U with the full 2MB
  # for the temp_allocation_pool array.
  set(ET_ARM_BAREMETAL_SCRATCH_TEMP_ALLOCATOR_POOL_SIZE 0x200000)
elseif(SYSTEM_CONFIG MATCHES "U85")
  if(MEMORY_MODE MATCHES "Dedicated_Sram")
    # 32MB of scratch buffer for Dedicated_Sram memory mode.
    set(ET_ARM_BAREMETAL_SCRATCH_TEMP_ALLOCATOR_POOL_SIZE 0x2000000)
    # For Dedicated_Sram, set the
    # ET_ARM_BAREMETAL_FAST_SCRATCH_TEMP_ALLOCATOR_POOL_SIZE to 384KB unless
    # specified otherwise.
    if(NOT DEFINED ET_ARM_BAREMETAL_FAST_SCRATCH_TEMP_ALLOCATOR_POOL_SIZE)
      set(ET_ARM_BAREMETAL_FAST_SCRATCH_TEMP_ALLOCATOR_POOL_SIZE 0x60000)
    endif()

  elseif(MEMORY_MODE MATCHES "Shared_Sram" OR MEMORY_MODE MATCHES "Sram_Only")
    # For Shared_Sram and Sram_Only, use a scratch buffer of 4MB - 64KB. The
    # Corstone-320 provides 4MB of SRAM and we subtract 64KB because we have a
    # few objects placed at the start of the SRAM, before the
    # temp_allocation_pool array.
    set(ET_ARM_BAREMETAL_SCRATCH_TEMP_ALLOCATOR_POOL_SIZE 0x3F0000)
  endif()
Comment on lines +197 to +200
# For Shared_Sram and Sram_Only, use a scratch buffer of 4MB - 64KB. The
# Corstone-320 provides 4MB of SRAM and we subtract 64KB because we have a
# few objects placed at the start of the SRAM, before the
# temp_allocation_pool array.
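
As a cross-check of the hex constants in the hunk above, the selection logic can be mirrored in a few lines of Python. This is a sketch for checking the arithmetic only, not part of the PR: the function name `default_pool_sizes` is hypothetical, substring tests stand in for CMake's `MATCHES`, and the `NOT DEFINED` override of the fast pool size is deliberately not modelled. The system-config strings in the usage lines are illustrative placeholders.

```python
from typing import Optional, Tuple

KB = 1024
MB = 1024 * KB


def default_pool_sizes(
    system_config: str, memory_mode: str
) -> Tuple[Optional[int], Optional[int]]:
    """Return (scratch_pool_bytes, fast_scratch_pool_bytes).

    None means the quoted hunk sets no default for that value.
    Hypothetical helper mirroring the CMake branches shown in the review.
    """
    if "U55" in system_config:
        # Corstone-300: the full 2 MB of SRAM.
        return 2 * MB, None
    if "U85" in system_config:
        if "Dedicated_Sram" in memory_mode:
            # 32 MB scratch plus a 384 KB fast scratch pool.
            return 32 * MB, 384 * KB
        if "Shared_Sram" in memory_mode or "Sram_Only" in memory_mode:
            # Corstone-320: 4 MB of SRAM minus 64 KB reserved at the start.
            return 4 * MB - 64 * KB, None
    return None, None


if __name__ == "__main__":
    # Placeholder config names; only the "U55"/"U85" substrings matter here.
    for cfg, mode in [
        ("Example_U55_Config", "Shared_Sram"),
        ("Example_U85_Config", "Dedicated_Sram"),
        ("Example_U85_Config", "Sram_Only"),
    ]:
        scratch, fast = default_pool_sizes(cfg, mode)
        print(cfg, mode, hex(scratch), hex(fast) if fast is not None else None)
```

Under these assumptions the three cases evaluate to 0x200000, 0x2000000 with 0x60000, and 0x3F0000, which agrees with the values set in the hunk.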

Labels

ciflow/trunk, CLA Signed, module: arm, partner: arm

3 participants